Our interest in the C. elegans heterochronic genes began
during Gary Ruvkun's post-PhD thesis defense
seminar tour of Europe in November, 1981. He took 
along a Xerox of one paper, a 1981 Cell paper by Marty
Chalfie, Bob Horvitz, and John Sulston, describing the 
detailed cell lineage analysis, but not the molecular iden- 
tity, of two genes that affect C. elegans developmental 
timing, lin-4 and unc-86 (Chalfie et al., 1981). The paper 
was replete with specialized language and concepts 
that he could not decipher; his bacterial genetics training 
did not prepare him for the patois of C. elegans develop- 
mental genetics, a product of the island tribe that 
evolved around Sydney Brenner at the Medical Re- 
search Council labs in Cambridge, England. 

Ruvkun visited the MRC on that trip and spent a few 
hours talking with Marty Chalfie about lin-4 and unc-86. 
That one afternoon at the MRC planted a seed: Ruvkun 
glimpsed the worm community, its ambition, its exuber- 
ance, its collaborative reflexes, its sense of mission. And 
the field seemed ready to explode at that moment:
there were lots of interesting mutants that were a few
technical developments away from exciting molecular 
discovery. The attraction of C. elegans developmental 
genetics reasserted itself after Bob Horvitz gave a departmental
seminar at Harvard in January of 1982 that
was just as confusing and interesting to Ruvkun as the 
1981 Cell paper. He went to MIT to talk about worms 
with Horvitz. Horvitz was very enthusiastic about crack- 
ing the problem of going molecular with these very 
promising genes identified by their genetics. Meeting 
with Horvitz was Ruvkun's second glimpse of the MRC 
worm culture, now transplanted to MIT by Horvitz, where 
it also transmuted to include a sense of urgency. Ruvkun 
began to work on the problem, part time in Wally Gil- 
bert's lab, where he had begun his postdoctoral work 
and which was a center of molecular biology expertise, 
and part time in the Horvitz lab, where he would learn 
worm genetics. 

Victor Ambros had just finished his genetic analysis 
of heterochronic genes (Ambros and Horvitz, 1984). The
most compelling of the genes was lin-14, because it 
had both gain-of-function and loss-of-function mutant 
alleles with opposite developmental timing defects. 
Such genetic attributes define switch genes, which were 
considered the keys to development in the Horvitz group
at that time; the virtues of such genes were recited at 
nearly every group meeting. Ambros and Horvitz had 
also discovered that lin-4 was a probable negative regu- 
lator of lin-14, so there was a developmental pathway 
to weave molecules into. Ambros was very keen to learn 
the molecular identity of lin-14 and offered to work to- 
gether with Ruvkun on the molecular analysis. 

The problem was that there was essentially no method 
to isolate a piece of DNA corresponding to a locus de- 
fined by genetics in C. elegans at that point. Transpo- 
sons had just been detected (Liao et al., 1983) and were 
thought to be responsible for spontaneous mutations, 
but there was no well-developed protocol for going mo- 
lecular. We decided to try a few strategies, some riskier 
than others. One very tricky and ambitious approach 
sought to detect DNA changes directly at the locus using 
a technique of Southern cross-blotting total DNA iso- 
lated from wild-type and mutant strains, but the techni- 
cal demands were too much to surmount and after a 
year we abandoned it (Ruvkun et al., 1990). We also 
tried other jackpot approaches. One view was that gain- 
of-function mutations might be caused by transposon 
insertions in the same way that retrovirus insertions acti- 
vate adjacent genes, so we probed every gain-of-func- 
tion mutation that had been isolated in the Horvitz lab 
with a transposon probe, seeking a new hybridization 
band. Nothing. We also sought mutations in conserved 
pathways before such concepts were so commonplace, 
probing Southern blots of many mutant C. elegans 
strains with DNA from Drosophila Notch and bithorax 
complex, as these first developmental control genes 
were isolated. We did not detect any changes in these 
genes. A C. elegans Notch homolog did emerge from 
molecular analysis of the lin-12 locus by Iva Greenwald 
at about the same time (Greenwald, 1985), so the idea 
was not entirely stupid, but it was naive to expect DNA 
homology and for this reason it did not work. Even the 
bithorax idea was pretty good, as use of more sophisti- 
cated degenerate oligonucleotide probes by Thomas 
Burglin that targeted conserved protein regions did re- 
veal many homeobox genes, including bithorax complex 
homologs (Burglin et al., 1989). 

More productive for the molecular identification of lin- 
14 was an RFLP approach. The transposon Tc1 had 
spread throughout the genome of one strain of C. ele- 
gans. We came up with a nifty way to find the Tc1 
insertion RFLPs in the genetic region closest to lin-14. 
This allowed us to bypass the identification of hundreds 
of RFLPs and jump right to the lin-14 genomic region. In 
this way, Ambros and Ruvkun constructed recombinant 
strains and cloned the particular transposon insertion 
loci that were nearest to lin-14. At the same time, Alan 
Coulson, Bob Waterston, and John Sulston were just 
beginning to assemble contigs of C. elegans cosmid 
clones. As each transposon Tc1 RFLP in the lin-14 geno- 
mic region was cloned, it could be matched with cos- 
mids and with extended contigs that contained those 
cosmids to assemble a genome map of the lin-14 region. 
We still did a bit of chromosome walking, but it was 
more like chromosome long jumping with the contig
assembly of the MRC genome group. The coup de grace 
of this lin-14 RFLP mapping was Ambros' identification 
of a recombinant chromosome that separated a lin-14 
gain-of-function allele from an intragenic loss-of-func- 
tion mutation he had also isolated, and Ruvkun's map- 
ping of this recombination event with other RFLPs to a 
few kilobases. The first evidence that lin-14 had been identified
came from the detection of DNA changes in these
few kilobases associated with two different lin-14 gain- 
of-function alleles (Ruvkun et al., 1989). 

At that point in 1985, Ruvkun moved on to a faculty 
position in the Department of Molecular Biology at Mas- 
sachusetts General Hospital and Department of Genet- 
ics at Harvard Medical School and recruited his first 
students and postdocs. Ambros had moved to a faculty 
position just up the road at Harvard the year before. 
Over the next few years, Bruce Wightman, Prema Arasu, 
Joe Gatto, John Giusto and Thomas Burglin determined 
that lin-14 gain-of-function mutations affect the lin-14 
3' UTR but do not affect transcript levels, suggesting 
that translation of the lin-14 mRNA is negatively regulated
by lin-4 (Ruvkun and Giusto, 1989). They found
that the lin-14 coding region was not homologous to 
any other protein in the then very sparse databases. To 
this day, lin-14 homologs have been detected only in 
other nematodes, suggesting that the protein compo- 
nent of the lin-4/lin-14 regulatory circuit is drifting fast 
or is an invention of the Nematoda. They showed that 
LIN-14 protein is localized to the nucleus, suggesting that it
may regulate gene expression. They also showed that 
the expression of LIN-14 protein is graded over time 
and that graded expression is disrupted in the lin-4 or 
lin-14 gain-of-function mutations, but mRNA levels are 
unaffected (Wightman et al., 1991, 1993; Arasu et al., 
1991). Because the lin-14 gain-of-function mutations 
mapped to the 3' UTR and caused molecular defects similar
to those of the lin-4 reduction-of-function mutation, they
predicted that the lin-4 gene product would regulate the 
lin-14 3' UTR. But they were definitely envisioning a 
regulatory protein that might engage the lin-14 3' UTR. 
When the Ambros group showed that lin-4 actually en- 
codes an RNA, the idea of an RNA-RNA interaction 
emerged. 

On the evening of June 11, 1992, Ambros and Ruvkun
exchanged the lin-4 and lin-14 3' UTR sequences and 
each detected the multiple elements in the lin-14 mRNA 
that are partially complementary to the lin-4 RNA. It was 
a moment when years of work came together into a clear 
model, a classic eureka moment. One lin-4 sequence
element focused the search. We guessed that the lin- 
4(ma161) G to U point mutation would constitute an 
"active site" of the lin-4 RNA and would be located in 
any RNA duplex with the lin-14 3' UTR. It was. Other 
lin-14 sequence comparisons and mutations instantly 
validated the lin-4 complementary sites. We asked if the 
lin-4 complementary regions are affected by lin-14 gain- 
of-function mutations and are conserved in C. briggsae. 
The two lin-14 gain-of-function mutations were neatly 
explained by the sites complementary to the lin-4 RNA: 
the weaker gain-of-function allele deletes 5 of 7 comple- 
mentary sites, whereas the stronger allele removes them 
all. We had shown that the C. elegans lin-14 3' UTR is 
temporally regulated in C. briggsae and that there are
multiple conserved sequence elements and long stretches
with no conserved sequence. Candy Lee, Rhonda Fein- 
baum, and Ambros had also shown that the lin-4 RNA 
sequence is conserved in C. briggsae, so we expected 
that lin-14 sequence elements important for lin-4 regulation
would be conserved (Wightman et al., 1993). They
were. Perfectly. In fact, the conservation of the lin-14 
mRNA sequences strongly supported the existence of 
distinct RNA duplex structures (e.g., bulged C versus 
more perfect duplexes), which we later proved to be 
correct (Ha et al., 1996). 
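
The kind of search described above, looking in a 3' UTR for elements partially complementary to a small RNA, can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the sequences below are invented toy strings, not lin-4 or lin-14; the function names and the match threshold are hypothetical; and the comparison is ungapped Watson-Crick pairing, so it ignores the bulges and G:U pairs that matter in the real duplexes.

```python
# Toy illustration only: sequences and threshold are invented, not lin-4 or lin-14.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA string (A, C, G, U only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def find_partial_sites(utr, small_rna, min_matches):
    """Return (start, matches) for every UTR window in which at least
    min_matches positions could pair with small_rna (ungapped, antiparallel)."""
    target = reverse_complement(small_rna)  # sequence of a perfect site
    width = len(target)
    hits = []
    for start in range(len(utr) - width + 1):
        window = utr[start:start + width]
        matches = sum(1 for a, b in zip(window, target) if a == b)
        if matches >= min_matches:
            hits.append((start, matches))
    return hits


small_rna = "AUGGCUAC"             # hypothetical 8 nt regulatory RNA
utr = "CCGUAGCCAUCCGUAGACAUCC"     # hypothetical UTR with two planted sites
sites = find_partial_sites(utr, small_rna, min_matches=7)
print(sites)  # one perfect site and one site with a single mismatch
```

Lowering `min_matches` admits weaker sites at the cost of more spurious ones, which is one reason mutations and cross-species conservation were needed to validate the candidate sites.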

To further test the model and explore the mechanism 
of the regulation, Wightman and Ilho Ha fused the lin-14
3' UTR onto a reporter gene and showed that it
is sufficient to generate graded temporal expression 
(Wightman et al., 1993). This showed that the lin-4 anti- 
sense RNA does not depend on other lin-14 mRNA se- 
quences (the 5' end, for example), constraining the 
mechanism of lin-4/lin-14 RNA duplex regulation of 
translation. By monitoring LIN-14 protein and mRNA 
levels, Wightman also showed that the interaction does 
not regulate lin-14 mRNA abundance, but rather translation
of the lin-14 mRNA. By monitoring β-galactosidase
activity and lacZ mRNA expression from a lacZ/lin-14 
3' UTR fusion gene in a wild-type or the lin-4 mutant 
background, Wightman and Ha also showed that lin-4 
acts posttranscriptionally via the lin-14 3' UTR to gener- 
ate graded temporal expression. This model was later 
validated and extended by Phil Olsen and Ambros, who 
showed that the translational control of lin-14 occurs at 
a postinitiation step, because the lin-14 mRNA that is 
not translated after lin-4 expression is upregulated is 
paradoxically localized to polysomes (Olsen and Ambros, 
1999). The localization of miRNAs to polysomes has
recently been shown for mammalian brain miRNAs
as well, suggesting that it is general to miRNAs
(Kim et al., 2003). 

The combination of the Lee et al. (1993) and Wightman
et al. (1993) Cell papers made a strong case for a direct
interaction between the lin-4 RNA and the lin-14 mRNA. 
In the usual bloodbath of publishing, the reviewers 
wanted more, more, more, and the editors, as usual, 
sent along noncommittal form letters to prompt us to 
do more experiments. So we wrote the usual missives 
passionately arguing for publication. The most germane 
request, which we had expected, was for proof of the 
RNA duplex model by constructing compensatory muta- 
tions in lin-4 and lin-14. In fact, the Ambros and Ruvkun 
labs tried to do this together but, due to instability of the 
two different and increasingly sophisticated engineered 
lin-4 mutants, we could not do the elegant experiment. 
We later did show that the lin-4 RNA binds to the lin-14 
mRNA in vitro and that a lin-14 3' UTR with mutations 
in each of the lin-4 binding sites is no longer downregu- 
lated by lin-4 in vivo nor bound in vitro (Ha et al., 1996). 
But because there were many experimental supports 
of the model besides allele-specific suppression, and 
because the strategy failed a few times, we abandoned 
elegance. Ruvkun continues to argue that elegance in 
molecular genetics is aesthetically pleasing but scientifi- 
cally overrated. 

After the two papers were published, Marv Wickens 
published a very nice News and Views in Nature that 
ended with the conjecture that tiny RNAs and their targets
might be more extensive than just lin-4 and lin-14
(Wickens and Takayama, 1994). But the discovery of the
world's first microRNA did not trigger a gold rush, not 
even by the Ambros or Ruvkun labs. First, the hetero- 
chronic pathway was a rather parochial object of study; 
while the lin-4 and lin-14 stories were published in high- 
profile journals, they were viewed as a novelty rather 
than a harbinger. The antisense regulation was similar 
to some prokaryotic gene regulatory vignettes, if one 
ignored how incredibly small lin-4 was (and lin-4 was 
four times smaller than any other noncoding regulatory 
RNA). Without homologs in other species, its generality 
did not emerge. We still did not suspect an extensive 
microRNA world even after Brenda Reinhart, Frank 
Slack, and others detected a second microRNA, let-7 
(Reinhart et al., 2000), because it emerged from genetic 
analysis of the same C. elegans heterochronic pathway; 
tiny RNAs could still have been inventions of this one 
pathway in this one species. 

We began to suspect an extensive tiny RNA world 
after genome database searches using the let-7 miRNA
sequence revealed perfect 22 nt matches in the newly
emerging Drosophila and human genome sequences.
The genome regions adjacent to the perfect matches 
could also fold into bulged and loopy precursors that 
looked a lot like the probable lin-4 and let-7 precursors of 
C. elegans, suggesting that these 22 nt perfect matches 
were not spurious. It is important to stress that the de- 
tection of these homologs demanded full genome data- 
bases, not the biased "protein world" information of 
EST databases. Amy Pasquinelli, Brenda Reinhart, and 
Ruvkun confirmed that the fly and human let-7 homologs 
detected in databases express a 22 nt RNA, and in 
collaboration with a large number of people who sent 
RNA samples, they showed that let-7 is conserved
across most of animal phylogeny, analyzing RNAs from 
a very satisfying range of nondomesticated animals 
such as coral, mollusks, annelids, acorn worms, you 
name it (Pasquinelli et al., 2000). Even more surprising 
were the findings by Pasquinelli and Reinhart that the 
temporal regulation of let-7 is also conserved in a wide 
range of species and by Slack and coworkers that complementary
sites in the target of let-7, the lin-41 mRNA,
are also conserved across phylogeny, strongly sug- 
gesting an ancient function in temporal patterning (Pas- 
quinelli et al., 2000; Slack et al., 2000). 
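
A perfect-match search of the sort described above can be illustrated with a short Python sketch. It is a minimal example under stated assumptions, not the procedure actually used: the 22 nt query and the miniature "genome" are invented strings (not let-7 or any real genome), the function names are hypothetical, and a real genome-scale search would rely on dedicated sequence-search software rather than plain string matching. The one point it does capture is that both DNA strands must be scanned.

```python
# Toy illustration only: the query and "genome" are invented strings.
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.translate(DNA_COMPLEMENT)[::-1]


def find_perfect_matches(genome, query):
    """Return sorted (position, strand) pairs for every exact occurrence of
    query on the given strand ('+') or on the opposite strand ('-')."""
    hits = []
    for strand, pattern in (("+", query), ("-", reverse_complement(query))):
        pos = genome.find(pattern)
        while pos != -1:
            hits.append((pos, strand))
            pos = genome.find(pattern, pos + 1)  # allow overlapping hits
    return sorted(hits)


query = "ACGTTGCAAGGCTTACGATCGA"  # hypothetical 22 nt sequence (DNA alphabet)
genome = "GGG" + query + "TTTT" + reverse_complement(query) + "CC"
matches = find_perfect_matches(genome, query)
print(matches)  # one hit on each strand
```

A match found this way is only a candidate; as the text notes, whether the flanking sequence can fold into a plausible precursor is what argues that a 22 nt match is not spurious.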

The conservation of the let-7 RNA was the key finding
that argued the generality of miRNAs for our group. It 
was then that we began a collaboration with the Church 
lab to search for more microRNA genes by informatics 
(Grad et al., 2003). Biochemical searches for miRNAs 
by the Ambros, Bartel, and Tuschl labs identified some 
of this mother lode before we did (Lau et al., 2001; Lagos-Quintana
et al., 2001; Lee and Ambros, 2001). Now we
know that there are hundreds, perhaps thousands, of 
miRNA genes in various genomes, about a third of which 
are conserved. A few of these genes have now emerged 
from genetic analysis in Arabidopsis and Drosophila, 
but many more have unknown functions (Ruvkun, 2001). 
And while the paradigm from lin-4/lin-14 remains the 
model, miRNAs have now been shown to control mRNA 
abundance in plants, and they could regulate many more 
RNA steps than translation. In addition, the assignment 
of the related siRNAs to chromatin silencing in S. pombe
suggests that miRNAs could act beyond the control of
mRNA abundance or translation (Volpe et al., 2002). 

An even deeper connection to RNAi started with nu- 
merological considerations (it cannot be called reason- 
ing). When siRNAs of 22 nt, the same size as lin-4 and 
let-7, were discovered by the Baulcombe and Tuschl
groups in 1999 and 2001 (Hamilton and Baulcombe, 
1999; Elbashir et al., 2001), Ruvkun noted that the num- 
ber 22 (the number of letters in the Hebrew alphabet) is 
stressed in the Kabbalah, a Jewish mystical tradition 
celebrated in medieval Spain, alternative bookstores, 
and a number of helpful Web sites (e.g.,
http://pws.prserv.net/leon/Kabbalah-articles/treeof.html). We began
to explore the action of the RNAi machinery in
miRNA maturation and activity. Amy Pasquinelli looked 
closely at the first RNAi-defective mutants, rde-1 and
rde-4, but could not detect any heterochronic defects 
nor any change in lin-4 or let-7 miRNA activity or processing.
At that point, Alla Grishok and Craig Mello contacted
us after they discovered that RNAi inactivation of
one of 28 different RDE-1 paralogs causes a phenotype 
similar to the let-7 lethality. Grishok and Pasquinelli then 
showed that RNAi inactivation of a pair of RDE-1 paralogs
and C. elegans Dicer disrupts miRNA processing
and activity, proving that the RNAi and miRNA pathways 
are related (Grishok et al., 2001). The intersection with 
RNAi dramatically increased the interest in miRNAs; with 
so many labs using RNAi as a tool, the mechanism by 
which miRNAs and siRNAs inhibit gene function has a 
large audience. Still emerging from genetic and func- 
tional genomic analysis are the components of the Dicer 
and RISC complexes that process dsRNA and miRNA 
precursors and present them to mRNAs, as well as com- 
ponents that recognize miRNA::mRNA duplexes to 
downregulate translation on polyribosomes. 

It is now clear that an extensive miRNA world was flying
almost unseen by our genetic radar. As much as geneti- 
cists like to think that nothing can escape genetic analy- 
sis, the miRNA genes are so small that they almost 
escaped our notice. Now, more miRNA genes are emerg- 
ing from loss-of-function genetics (Johnston and Hob- 
ert, 2003) as well as from gain-of-function genetics fo- 
cused on mutations in the target genes that are 
negatively regulated by miRNAs (Llave et al., 2002) or 
based on misexpression of the miRNAs (Brennecke et 
al., 2003). The families of miRNAs that have emerged 
from informatic and biochemical analyses suggest much 
gene duplication and divergence, as has been seen in 
other large families of regulatory genes in multicellular 
organisms, such as transcription factor genes. The 22 
nt length is just about right for specificity to particular 
target genes as well as for plasticity to evolve towards 
new gene targets. The number of protein and large RNA 
coding genes in multicellular animals and plants is sur- 
prisingly small, only an order of magnitude more than 
in microbial genomes. The flowering of the diverse and 
numerous miRNA genes in animals and plants may turn 
out to mediate much of the gene regulation that gener- 
ates cell diversity and developmental patterning, as well 
as the gene regulation underlying other recent inven- 
tions in animals such as synaptic signaling and its modu- 
lation. 